Hamilton County
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
Allen-Zhu, Zeyuan, Li, Yuanzhi
Large language models (LLMs) can store a vast amount of world knowledge, often extractable via question-answering (e.g., "What is Abraham Lincoln's birthday?"). However, do they answer such questions based on exposure to similar questions during training (i.e., cheating), or by genuinely learning to extract knowledge from sources like Wikipedia? In this paper, we investigate this issue using a controlled biography dataset. We find a strong correlation between the model's ability to extract knowledge and various diversity measures of the training data. Essentially, for knowledge to be reliably extracted, it must be sufficiently augmented (e.g., through paraphrasing, sentence shuffling) during pretraining. Without such augmentation, knowledge may be memorized but not extractable, leading to 0% accuracy, regardless of subsequent instruction fine-tuning. To understand why this occurs, we employ (nearly) linear probing to demonstrate a strong connection between the observed correlation and how the model internally encodes knowledge -- whether it is linearly encoded in the hidden embeddings of entity names or distributed across other token embeddings in the training text. This paper provides several key recommendations for LLM pretraining in the industry: (1) rewrite the pretraining data -- using small, auxiliary models -- to provide knowledge augmentation, and (2) incorporate more instruction fine-tuning data into the pretraining stage before it becomes too late.
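The sentence-shuffling augmentation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the seed-per-variant scheme, and the naive split on periods are all assumptions made for illustration, and the splitting would break on text containing abbreviations or decimal numbers.

```python
import random


def shuffle_sentences(biography: str, seed: int) -> str:
    """Return one variant of the biography with its sentences reordered.

    Assumption: sentences are separated by periods and contain no
    abbreviations; a real pipeline would need a proper sentence splitter.
    """
    sentences = [s.strip() for s in biography.split(".") if s.strip()]
    rng = random.Random(seed)  # seeded so each variant is reproducible
    rng.shuffle(sentences)
    return " ".join(s + "." for s in sentences)


def augment(biography: str, n_variants: int) -> list[str]:
    """Produce n_variants reorderings of one biography (hypothetical helper)."""
    return [shuffle_sentences(biography, seed) for seed in range(n_variants)]


if __name__ == "__main__":
    bio = "Anya was born on October 2, 1996. She studied at MIT. She works in Boston."
    for variant in augment(bio, 3):
        print(variant)
```

Each variant carries the same facts in a different order, which is the kind of training-data diversity the abstract correlates with extractability.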
Social Network Analysis and Validation of an Agent-Based Model
Pine, Karleigh, Klipfel, Joel, Bennett, Jared, Bade, Nathaniel, Manasseh, Christian
Agent-based models (ABMs) simulate the formation and evolution of social processes at a fundamental level by decoupling agent behavior from global observations. In the case where ABM networks evolve over time as a result of (or in conjunction with) agent states, there is a need for understanding the relationship between the dynamic processes and network structure. Social networks provide a natural set of tools for understanding the emergent relationships of these systems. This work examines the utility of a collection of network comparison methods for the purpose of tracking network changes in an ABM over time or between model parameters. Among the techniques examined is a novel graph pseudometric based on heat content asymptotics, which have been shown to distinguish many isospectral graphs that are not isomorphic. Additionally, we establish the use of observations about real-world networks from network science (e.g., fat-tailed degree distribution, small-world property) for ABM validation in the case where empirical population data are unavailable. These methods are all demonstrated on systematic perturbations of an original model simulating the formation of friendships in a population of 20,000 agents in Cincinnati, OH.
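The network-science checks named in the abstract above (degree distribution, small-world property) can be sketched with standard graph statistics. The following is an illustrative sketch only, not the authors' code: it assumes the friendship network is an undirected graph stored as a dictionary mapping each agent to a set of neighbors, and all function names are hypothetical. A small-world network typically combines high average clustering with a short average path length.

```python
from collections import Counter, deque


def degree_histogram(adj):
    """Count how many nodes have each degree (input for a fat-tail check)."""
    return Counter(len(nbrs) for nbrs in adj.values())


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        nbrs_list = list(nbrs)
        # Count edges that exist between pairs of this node's neighbors.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs_list[j] in adj[nbrs_list[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    print(degree_histogram(triangle))
    print(average_clustering(triangle), average_path_length(triangle))
```

For a network of 20,000 agents the all-pairs BFS is quadratic in the worst case, so sampling source nodes would be a reasonable shortcut in practice.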
Baby Physical Safety Monitoring in Smart Home Using Action Recognition System
Adewopo, Victor, Elsayed, Nelly, Anderson, Kelly
Humans are able to intuitively deduce actions that took place between two states in observations via deductive reasoning. This is because the brain operates on a bidirectional communication model, which has radically improved the accuracy of recognition and prediction based on features connected to previous experiences. During the past decade, deep learning models for action recognition have significantly improved. However, deep neural networks still struggle when only a small dataset is available for a specific action recognition (AR) task. As with most action recognition tasks, the ambiguity of accurately describing activities in spatial-temporal data is a drawback that can be overcome by curating suitable datasets, including careful annotations and preprocessing of video data for analyzing various recognition tasks. In this study, we present a novel lightweight framework combining transfer learning techniques with a Conv2D LSTM layer to extract features from the I3D model pre-trained on the Kinetics dataset for a new AR task (Smart Baby Care) that requires a smaller dataset and fewer computational resources. Furthermore, we developed a benchmark dataset and an automated model that uses LSTM convolution with I3D (ConvLSTM-I3D) for recognizing and predicting baby activities in a smart baby room. Finally, we implemented video augmentation to improve model performance on the smart baby care task. Compared to other benchmark models, our experimental framework achieved better performance with fewer computational resources.
DeepCPG Policies for Robot Locomotion
Deshpande, Aditya M., Hurd, Eric, Minai, Ali A., Kumar, Manish
Central Pattern Generators (CPGs) form the neural basis of the observed rhythmic behaviors for locomotion in legged animals. CPG dynamics organized into networks allow the emergence of complex locomotor behaviors. In this work, we take this inspiration for developing walking behaviors in multi-legged robots. We present novel DeepCPG policies that embed CPGs as a layer in a larger neural network and facilitate end-to-end learning of locomotion behaviors in a deep reinforcement learning (DRL) setup. We demonstrate the effectiveness of this approach on physics engine-based insectoid robots. We show that, compared to traditional approaches, DeepCPG policies allow sample-efficient end-to-end learning of effective locomotion strategies even in the case of high-dimensional sensor spaces (vision). We scale the DeepCPG policies using a modular robot configuration and multi-agent DRL. Our results suggest that gradual complexification with embedded priors of these policies in a modular fashion could achieve non-trivial sensor and motor integration on a robot platform. These results also indicate the efficacy of bootstrapping more complex intelligent systems from simpler ones based on biological principles. Finally, we present experimental results for a proof-of-concept insectoid robot system, for which DeepCPG policies were first learned in the simulation engine and then transferred to real-world robots without any additional fine-tuning.
Senior Associate, Data Engineering at dentsu international - Cincinnati, OH, United States
We innovate the way brands are built. That means we do things differently so they're better than before. In this way, we make our clients' most important marketing assets--their brands--win in a changing world. Dentsu International is a modern marketing solutions company. Our mission is to help clients navigate, progress, and thrive in a world of change.
Deep Multistage Multi-Task Learning for Quality Prediction of Multistage Manufacturing Systems
Yan, Hao, Sergin, Nurretin Dorukhan, Brenneman, William A., Lange, Stephen Joseph, Ba, Shan
In multistage manufacturing systems (MMS), modeling multiple quality indices based on the process sensing variables is important. However, the classic modeling technique predicts each quality variable one at a time, which fails to consider the correlation within or between stages. We propose a deep multistage multi-task learning framework to jointly predict all output sensing variables in a unified end-to-end learning framework according to the sequential system architecture in the MMS. Our numerical studies and a real case study show that the new model achieves superior performance compared to many benchmark methods, as well as great interpretability through the developed variable selection techniques.
SureFly's hybrid electric octocopter drone achieves first manned flight
The hybrid two-seat helicopter SureFly from the electric truck company Workhorse made it off the ground this week. While a few feet of hover might seem insignificant, the passenger drone startup is hailing the untethered lift-off with a pilot outside of Cincinnati, Ohio, as a huge success. For the hybrid gas- and battery-powered vertical take-off and landing vehicle (known as VTOL), this means the copter is on its way to flying with passengers inside. Once airborne, the craft will have a 75-mile range. "People want to have something in their garage to take out and fly," SureFly CEO Steve Burns said in a call Friday afternoon.
Industry 4.0 & Smart Factories ACC Software Solutions
We are on the cusp of the fourth industrial revolution, also known as Industry 4.0. With Cincinnati, Ohio declaring itself an "Industry 4.0 demonstration city," the question isn't if Industry 4.0 is coming, but rather how quickly. Technologies such as the Internet of Things (IoT), robotics and automation have ushered in this fourth age of widespread change. These technologies, along with Big Data and analytics, and cyber-physical systems sit at the heart of the Industry 4.0 model and they all run on the cloud. Over the next decade, these emerging technologies will create new real-time connections between machines, production processes, and systems that will revolutionize the way we do business.
Clonal analysis of newborn hippocampal dentate granule cell proliferation and development in temporal lobe epilepsy
Singh, Shatrunjai P., LaSarge, Candi L., An, Amen, McAuliffe, John J., Danzer, Steve C.
Hippocampal dentate granule cells are among the few neuronal cell types generated throughout adult life in mammals. In the normal brain, new granule cells are generated from progenitors in the subgranular zone and integrate in a typical fashion. During the development of epilepsy, granule cell integration is profoundly altered. The new cells migrate to ectopic locations and develop misoriented basal dendrites. Although it has been established that these abnormal cells are newly generated, it is not known whether they arise ubiquitously throughout the progenitor cell pool or are derived from a smaller number of bad actor progenitors. To explore this question, we conducted a clonal analysis study in mice expressing the Brainbow fluorescent protein reporter construct in dentate granule cell progenitors. Mice were examined 2 months after pilocarpine-induced status epilepticus, a treatment that leads to the development of epilepsy. Brain sections were rendered translucent so that entire hippocampi could be reconstructed and all fluorescently labeled cells identified. Our findings reveal that a small number of progenitors produce the majority of ectopic cells following status epilepticus, indicating that either the affected progenitors or their local microenvironments have become pathological. By contrast, granule cells with basal dendrites were equally distributed among clonal groups. This indicates that these progenitors can produce normal cells and suggests that global factors sporadically disrupt the dendritic development of some new cells. Together, these findings strongly predict that distinct mechanisms regulate different aspects